Performance Gain with Variable Chunk Size in GFS-like File Systems

Authors

  • Zhifeng YANG
  • Qichen TU
  • Kai FAN
  • Lei ZHU
  • Rishan CHEN
  • Bo PENG
Abstract

We have designed and implemented the Tianwang File System (TFS), a distributed file system much like the Google File System (GFS). The system has its origins in our Tianwang search engine and web mining research. TFS makes the same assumptions and has the same architecture as GFS. However, one key design choice, a variable chunk size, lets our system adopt simpler system interactions, which significantly improves the performance of the record append operation. In this paper, we discuss the many aspects of our design that differ from GFS and evaluate their pros and cons through performance experiments. The experimental results show that the utilization ratio of our record append operation exceeds that of GFS by 25%, and that the record append throughput of TFS is also several times higher.
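As a rough illustration of why chunk size matters for record append, the sketch below compares two toy append models. The fixed-size model follows the publicly documented GFS behavior of padding the current chunk when a record does not fit. The variable-size model simply closes a chunk at its current length, which is an assumption about what "variable chunk size" permits and is not taken from the TFS implementation. The function names, the 64-unit chunk size, and the payload-over-consumed-space metric are all hypothetical; the abstract does not define how the paper measures its utilization ratio.

```python
def append_fixed(record_sizes, chunk_size):
    """Toy model of fixed-size chunks: pad the chunk when a record does not fit.

    Returns (payload_bytes, padding_bytes).
    """
    payload = padding = used = 0
    for size in record_sizes:
        if size > chunk_size:
            raise ValueError("record larger than chunk")
        if used + size > chunk_size:
            # The remainder of the current chunk is wasted as padding,
            # and the record goes into a fresh chunk.
            padding += chunk_size - used
            used = 0
        used += size
        payload += size
    return payload, padding


def append_variable(record_sizes, max_chunk_size):
    """Toy model of variable-size chunks: close a chunk at its current length.

    Returns the list of resulting chunk lengths (no padding is written).
    """
    chunks = []
    used = 0
    for size in record_sizes:
        if size > max_chunk_size:
            raise ValueError("record larger than maximum chunk size")
        if used and used + size > max_chunk_size:
            chunks.append(used)  # the chunk simply ends here
            used = 0
        used += size
    if used:
        chunks.append(used)
    return chunks


def utilization(payload, overhead):
    """Fraction of consumed space that holds real record data."""
    total = payload + overhead
    return payload / total if total else 1.0


if __name__ == "__main__":
    records = [40, 40, 40]  # hypothetical record sizes
    payload, padding = append_fixed(records, chunk_size=64)
    print("fixed:   ", utilization(payload, padding))
    chunks = append_variable(records, max_chunk_size=64)
    print("variable:", utilization(sum(chunks), 0))
```

In this toy model the fixed-size scheme loses space whenever a record straddles a chunk boundary, while the variable-size scheme does not. The numbers it produces are illustrative only and should not be read as the paper's measurements.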

Related resources

Improving the performance of Google file system to support Big Data

After doing research on Google File System, we find out some methods to improve the performance of Google file system. Google File System is a scalable distributed file system for large size distributed data-intensive applications. It provides high fault tolerance while running on inexpensive commodity hardware and it delivers high aggregate performance to a large number of clients. But there a...

Implementing Journaling in a Linux Shared Disk File System

In computer systems today, speed and responsiveness is often determined by network and storage subsystem performance. Faster, more scalable networking interfaces like Fibre Channel and Gigabit Ethernet provide the scaffolding from which higher performance computer systems implementations may be constructed, but new thinking is required about how machines interact with network-enabled storage de...

HBase and Hypertable for large scale distributed storage systems A Performance evaluation for Open Source BigTable Implementations

BigTable is a distributed storage system developed at Google for managing structured data and has the capability to scale to a very large size: petabytes of data across thousands of commodity servers. As now, there exist two open-source implementations that closely emulate most of the components of Google’s BigTable i.e. HBase and Hypertable. HBase is written in Java and provides BigTable like ...

Variable Chunk Based Parallel Switching To Minimizing File Download Time in P2P Network

The Peer-to-peer (P2P) computing has been one of the emerging technologies in distributed file sharing. Experimental studies show that for a file download, service capacity fluctuation takes minutes to several hours. For a P2P one of the fundamental performances metric is the average download time. The common approach to analyse the average download time is average service capacity. Heterogenei...

A 64-bit, Shared Disk File System for Linux

In computer systems today, speed and responsiveness is often determined by network and storage subsystem performance. Faster, more scalable networking interfaces like Fibre Channel and Gigabit Ethernet provide the scaffolding from which higher performance implementations may be constructed, but new thinking is required about how machines interact with network-enabled storage devices. We have de...

Publication date: 2008